The astcenc vector library effectively implements two different class APIs:

- Explicit 4-wide vector classes (e.g. `vfloat4`) in the codec.
- Vector-length-agnostic (VLA) classes (e.g. `vfloat`) in the codec, where the width is resolved at compile time.

For historical reasons, the classes that are only used as VLA classes (e.g. `vfloat8` for AVX2) implement a lot of functionality that was inherited from the original 4-wide implementation and is not actually used in the VLA parts of the codec. This makes adding a new VLA implementation (e.g. Arm SVE) more expensive than it needs to be.

This PR doesn't add SVE support, but does some cleanup to minimize the vector library API as a precursor to doing so. The main changes are:
- Replace uses of `.lane<0>()` with dedicated scalar function returns, e.g. use `hmax_s()` rather than `hmax.lane<0>()`. This was being done in places before, but not consistently. Now this pattern is used everywhere.
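
To illustrate the difference between the two call styles, here is a minimal sketch. The `toy_vfloat4` type and both function bodies are hypothetical stand-ins written for this example, not the astcenc implementation; only the names `hmax`, `hmax_s`, and `.lane<0>()` come from the description above.

```cpp
#include <algorithm>
#include <array>

// Hypothetical 4-wide vector type for illustration only; the real astcenc
// classes wrap SIMD registers and are not reproduced here.
struct toy_vfloat4
{
    std::array<float, 4> m;

    // Compile-time lane accessor, mirroring the .lane<N>() idiom.
    template <int N>
    float lane() const
    {
        static_assert(N >= 0 && N < 4, "lane index out of range");
        return m[N];
    }
};

// Horizontal max returned as a vector, with the result copied to every lane.
inline toy_vfloat4 hmax(const toy_vfloat4& a)
{
    float r = std::max(std::max(a.m[0], a.m[1]), std::max(a.m[2], a.m[3]));
    return toy_vfloat4{{r, r, r, r}};
}

// Horizontal max returned directly as a scalar, so callers never need to
// extract a lane from a vector result.
inline float hmax_s(const toy_vfloat4& a)
{
    return std::max(std::max(a.m[0], a.m[1]), std::max(a.m[2], a.m[3]));
}
```

With this sketch, `hmax(v).lane<0>()` and `hmax_s(v)` yield the same value. The scalar form keeps lane extraction out of calling code, which plausibly matters for a VLA backend where a per-lane accessor is not a natural fit.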